6 research outputs found

    Advanced machine learning algorithms for discrete datasets


    IDENTIFICATION OF INFLUENTIAL USERS IN SPEECH-BASED NETWORKS

    No full text
    Online social networks act as good mediums for communication but are also becoming popular for targeting the social needs of their users. Mainstream social networks are still unable to incorporate low-literate users into their user base because their interfaces are on the web or in Short Message Service (SMS) format, even though low-literate people constitute a major portion of the world’s population. Speech-based networks (SBNs) overcome these limitations by providing a simple speech-based interface for users. In this work, we present a systematic analysis of SBNs designed specifically for low-literate users, with a focus on the identification of influential users. The task of finding influential users has not been studied for SBNs. Furthermore, knowledge of influential users can help optimize the operation of SBNs, which typically run in low-income regions with low budgets. We demonstrate how an SBN is formed from call data records and define its key features or characteristics. We then propose a feature-based method for influence ranking in an SBN. Existing methods for influence maximization in social networks are not directly applicable to SBNs. Hence, we present a method for calculating influence probabilities between users in the network, enabling the application of the greedy algorithm and the degree discount heuristic for influence maximization and the computation of betweenness centrality in an SBN. We evaluate our methodology on data from a real-world SBN called Polly. We compare the results with those from existing methods and show that our methods are both effective and time-efficient for use in SBNs.
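The degree discount heuristic named in this abstract can be illustrated with a minimal sketch. It follows the commonly cited form of the heuristic, assuming an undirected graph and a single uniform propagation probability `p`. The abstract instead derives per-pair influence probabilities from call data, which this sketch does not attempt to reproduce. The function name, the adjacency-dict input, and the default value of `p` are all illustrative assumptions.

```python
def degree_discount(graph, k, p=0.01):
    """Pick up to k seed nodes with the degree discount heuristic.

    graph: dict mapping each node to a set of neighbours (undirected).
    p: assumed uniform propagation probability (illustrative default).
    """
    degree = {v: len(nbrs) for v, nbrs in graph.items()}
    discounted = dict(degree)              # current discounted degree of each node
    seed_nbrs = {v: 0 for v in graph}      # number of neighbours already chosen as seeds
    seeds, chosen = [], set()

    for _ in range(min(k, len(graph))):
        # Take the unselected node with the largest discounted degree.
        u = max((v for v in graph if v not in chosen), key=lambda v: discounted[v])
        seeds.append(u)
        chosen.add(u)
        # Discount every unselected neighbour of the new seed.
        for v in graph[u]:
            if v in chosen:
                continue
            seed_nbrs[v] += 1
            d, t = degree[v], seed_nbrs[v]
            discounted[v] = d - 2 * t - (d - t) * t * p
    return seeds


# Toy network: a star around "a" and a smaller star around "e".
toy = {
    "a": {"b", "c", "d"},
    "b": {"a"}, "c": {"a"}, "d": {"a"},
    "e": {"f", "g"},
    "f": {"e"}, "g": {"e"},
}
seeds = degree_discount(toy, 2)
```

On the toy network the second seed comes from the untouched star rather than from a neighbour of the first seed, which is the behaviour the discount is meant to produce.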

    Exploiting reject option in classification for social discrimination control

    No full text
    Social discrimination is said to occur when an unfavorable decision for an individual is influenced by her membership in certain protected groups, such as females and minority ethnic groups. Such discriminatory decisions often exist in historical data. Despite recent work in discrimination-aware data mining, there remains a need for robust, yet easily usable, methods for discrimination control. In this paper, we utilize the reject option in classification, a general decision-theoretic framework for handling instances whose labels are uncertain, for modeling and controlling discriminatory decisions. Specifically, this framework permits a formal treatment of the intuition that instances close to the decision boundary are more likely to be discriminated against in a dataset. Based on this framework, we present three different solutions for discrimination-aware classification. The first solution invokes probabilistic rejection in single or multiple probabilistic classifiers, while the second solution relies upon ensemble rejection in classifier ensembles. The third solution integrates one of the first two solutions with situation testing, a procedure commonly used in courts of law. All solutions are easy to use and provide strong justifications for the decisions. We evaluate our solutions extensively on four real-world datasets and compare their performance with that of previously proposed discrimination-aware classifiers. The results demonstrate the superiority of our solutions in terms of both performance and flexibility of applicability. In particular, our solutions are effective at removing illegal discrimination from the predictions. (C) 2017 Published by Elsevier Inc.
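The idea of probabilistic rejection near the decision boundary can be sketched as follows. This is a minimal illustration and not the paper's exact procedure. It assumes a binary task with a favorable label 1, a single classifier's posterior `p_pos`, and a confidence threshold `theta`. The relabeling rule for rejected instances (favorable for protected-group members, unfavorable otherwise) is an assumption made here for illustration, as are the function and parameter names.

```python
def reject_option_label(p_pos, is_protected, theta=0.7):
    """Label one instance, treating low-confidence predictions as rejected.

    p_pos: posterior probability of the favorable class (label 1).
    is_protected: True if the instance belongs to the protected group.
    theta: assumed confidence threshold, 0.5 < theta <= 1.
    """
    if not 0.5 < theta <= 1.0:
        raise ValueError("theta must lie in (0.5, 1]")

    confidence = max(p_pos, 1.0 - p_pos)
    if confidence < theta:
        # Rejected: the instance is close to the decision boundary.
        # Assumed rule: decide in favor of the protected group.
        return 1 if is_protected else 0
    # Confident prediction: keep the classifier's own decision.
    return 1 if p_pos >= 0.5 else 0
```

Raising `theta` widens the rejected region, so more borderline decisions are overridden; with `theta` just above 0.5 the function behaves almost like the unmodified classifier.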

    Layered convolutional dictionary learning for sparse coding itemsets

    No full text
    Dictionary learning for sparse coding has been used successfully in different domains; however, it has never been employed for interesting itemset mining. In this paper, we formulate an optimization problem for extracting a sparse representation of itemsets and show that the discrete nature of itemsets makes it NP-hard. An efficient approximation algorithm is presented which greedily solves maximum set cover to reduce the overall compression loss. Furthermore, we incorporate our sparse representation algorithm into a layered convolutional model to learn nonredundant dictionary items. Following the intuition of deep learning, our convolutional dictionary learning approach convolves learned dictionary items and discovers statistically dependent patterns using chi-square in a hierarchical fashion, with each layer having a more abstract and compressed dictionary than the previous one. An extensive empirical validation is performed on thirteen datasets, showing better interpretability and semantic coherence of our approach than two existing state-of-the-art methods.
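The greedy treatment of maximum set cover mentioned in this abstract can be illustrated with the generic greedy maximum coverage routine below. It is a sketch of the general technique only: the paper's compression-loss objective and its dictionary construction are not reproduced, and the function name and the dict-of-sets input are assumptions.

```python
def greedy_max_cover(candidates, k):
    """Greedily choose up to k candidate sets that cover the most elements.

    candidates: dict mapping a candidate name to the set of elements it covers.
    Returns the chosen names in selection order and the covered elements.
    """
    remaining = dict(candidates)
    covered, chosen = set(), []

    for _ in range(k):
        if not remaining:
            break
        # Pick the candidate that adds the most not-yet-covered elements.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no candidate adds anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Toy example: candidate itemsets and the items each one would cover.
toy_candidates = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {4, 5, 6, 7},
    "D": {1},
}
chosen, covered = greedy_max_cover(toy_candidates, 2)
```

Each round re-evaluates the marginal gain of every remaining candidate, which is what gives the greedy strategy its well-known approximation guarantee for maximum coverage.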